Sarcasm Detection on Czech and English Twitter
نویسندگان
چکیده
This paper presents a machine learning approach to sarcasm detection on Twitter in two languages – English and Czech. Although there has been some research in sarcasm detection in languages other than English (e.g., Dutch, Italian, and Brazilian Portuguese), our work is the first attempt at sarcasm detection in the Czech language. We created a large Czech Twitter corpus consisting of 7,000 manually-labeled tweets and provide it to the community. We evaluate two classifiers with various combinations of features on both the Czech and English datasets. Furthermore, we tackle the issues of rich Czech morphology by examining different preprocessing techniques. Experiments show that our language-independent approach significantly outperforms adapted state-of-the-art methods in English (F-measure 0.947) and also represents a strong baseline for further research in Czech (F-measure 0.582).
منابع مشابه
Putting Sarcasm Detection into Context: The Effects of Class Imbalance and Manual Labelling on Supervised Machine Classification of Twitter Conversations
Sarcasm can radically alter or invert a phrase’s meaning. Sarcasm detection can therefore help improve natural language processing (NLP) tasks. The majority of prior research has modeled sarcasm detection as classification, with two important limitations: 1. Balanced datasets, when sarcasm is actually rather rare. 2. Using Twitter users’ self-declarations in the form of hashtags to label data, ...
متن کاملTweet Sarcasm: Mechanism of Sarcasm Detection in Twitter
The www is growing at an alarming rate . Users have started participating actively on Internet by giving their opinions on products, services and blogs. Study and analysis of such opinions is known as Opinion Mining. But sometimes users prefer being sarcastic. Sarcasm is a linguistic phenomenon in which people state the opposite of what they actually mean. Sarcasm Detection is a challenging tas...
متن کاملContextualized Sarcasm Detection on Twitter
Sarcasm requires some shared knowledge between speaker and audience; it is a profoundly contextual phenomenon. Most computational approaches to sarcasm detection, however, treat it as a purely linguistic matter, using information such as lexical cues and their corresponding sentiment as predictive features. We show that by including extra-linguistic information from the context of an utterance ...
متن کاملMagnets for Sarcasm: Making Sarcasm Detection Timely, Contextual and Very Personal
Sarcasm is a pervasive phenomenon in social media, permitting the concise communication of meaning, affect and attitude. Concision requires wit to produce and wit to understand, which demands from each party knowledge of norms, context and a speaker’s mindset. Insight into a speaker’s psychological profile at the time of production is a valuable source of context for sarcasm detection. Using a ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014